A framework for comparing heterogeneous objects: on the similarity measurements for fuzzy, numerical and categorical attributes

Authors

  • Yasmina Bashon
  • Daniel Neagu
  • Mick J. Ridley
Abstract

Real-world data collections are often heterogeneous, that is, described by a mix of attribute types: numerical, categorical and fuzzy. Since most available similarity measures can be applied to only one type of data, it becomes essential to construct an appropriate similarity measure for comparing such complex data. In this paper, a framework of new and unified similarity measures is proposed for comparing heterogeneous objects described by numerical, categorical and fuzzy attributes. Examples are used to illustrate, compare and discuss the applications and efficiency of the proposed approach to heterogeneous data comparison and clustering.
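The abstract does not state the proposed measures themselves, so the following is only a minimal sketch of the general idea of comparing objects with mixed attribute types. It substitutes well-known stand-ins that are not taken from the paper: a range-normalised difference for numerical attributes, simple matching for categorical attributes, a fuzzy Jaccard index for fuzzy attributes, and an unweighted mean to combine them (a Gower-style aggregation). All names (`object_similarity`, `schema`, the example attributes and values) are hypothetical.

```python
from typing import Dict, Tuple, Union

# Assumption: a fuzzy value is a map from linguistic term to membership in [0, 1].
FuzzyValue = Dict[str, float]
Value = Union[float, str, FuzzyValue]


def numerical_similarity(a: float, b: float, value_range: float) -> float:
    """Range-normalised similarity: 1 for equal values, 0 at the full range."""
    if value_range <= 0:
        return 1.0
    return 1.0 - min(abs(a - b) / value_range, 1.0)


def categorical_similarity(a: str, b: str) -> float:
    """Simple matching: 1 if the categories are identical, otherwise 0."""
    return 1.0 if a == b else 0.0


def fuzzy_similarity(a: FuzzyValue, b: FuzzyValue) -> float:
    """Fuzzy Jaccard index: sum of minima divided by sum of maxima."""
    terms = set(a) | set(b)
    intersection = sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in terms)
    union = sum(max(a.get(t, 0.0), b.get(t, 0.0)) for t in terms)
    return intersection / union if union > 0 else 1.0


def object_similarity(
    x: Dict[str, Value],
    y: Dict[str, Value],
    schema: Dict[str, Tuple[str, float]],
) -> float:
    """Unweighted mean of per-attribute similarities (Gower-style stand-in).

    schema maps attribute name -> (kind, value_range); value_range is only
    used for numerical attributes.
    """
    total = 0.0
    for name, (kind, value_range) in schema.items():
        if kind == "numerical":
            total += numerical_similarity(x[name], y[name], value_range)
        elif kind == "categorical":
            total += categorical_similarity(x[name], y[name])
        elif kind == "fuzzy":
            total += fuzzy_similarity(x[name], y[name])
        else:
            raise ValueError(f"unknown attribute kind: {kind}")
    return total / len(schema)


# Hypothetical example objects with one attribute of each type.
schema = {
    "age": ("numerical", 50.0),
    "colour": ("categorical", 0.0),
    "height": ("fuzzy", 0.0),
}
x = {"age": 30.0, "colour": "red", "height": {"short": 0.2, "medium": 0.8}}
y = {"age": 40.0, "colour": "red", "height": {"medium": 0.5, "tall": 0.5}}

print(round(object_similarity(x, y, schema), 4))
```

Every per-attribute similarity lies in [0, 1], so the combined score does too and could in principle be passed to any similarity-based clustering routine. The measures in the paper may differ in both the per-attribute definitions and the way they are combined.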

Similar resources

A Framework for Comparing Heterogeneous Objects: on the Similarity Measurements for Fuzzy, Numerical and Categorical Attributes (Soft Computing)

Real-world data collections are often heterogeneous (represented by a set of mixed attribute data types: numerical, categorical and fuzzy); since most available similarity measures can only be applied to one type of data, it becomes essential to construct an appropriate similarity measure for comparing such complex data. In this paper, a framework of new and unified similarity measures is proposed ...

A Framework for Optimal Attribute Evaluation and Selection in Hesitant Fuzzy Environment Based on Enhanced Ordered Weighted Entropy Approach for Medical Dataset

Background: In this paper, a generic hesitant fuzzy set (HFS) model for clustering various ECG beats according to weights of attributes is proposed. A comprehensive review of electrocardiogram signal classification and segmentation methodologies indicates that ECG analysis should use algorithms that can effectively handle the nonstationarity and uncertainty of the signals. Ex...

Numerical and Categorical Attributes Data Clustering Using K-Modes and Fuzzy K-Modes

Most existing clustering approaches are applicable to purely numerical or purely categorical data, but not both. In general, clustering mixed data composed of numerical and categorical attributes is a nontrivial task because there is an awkward gap between the similarity metrics for categorical and numerical data. This paper therefore presents a general clustering ...

Grouping Objects to Homogeneous Classes Satisfying Requisite Mass

Grouping datasets plays an important role in many areas of scientific research. Depending on data features and applications, different constraints are imposed on groups, while having groups with similar members is always a main criterion. In this paper, we propose an algorithm for grouping the objects with random labels, nominal features having too many nominal attributes. In addition, the size constra...

A Clustering Algorithm for Categorical Data Based on Combining Measures

Clustering is one of the main techniques in data mining. It is a process that partitions a data set into groups such that the data within a cluster are as close to each other as possible and the data in different clusters differ as much as possible. Clustering algorithms are divided into two categories according to the type of data: clustering algorithms for numerical data and clustering algor...

Journal:
  • Soft Comput.

Volume 17  Issue 

Pages  -

Publication date: 2013